Skip to content

Conversation

@wenju-he
Copy link
Contributor

3-component vector type is supported for them per OpenCL spec.

…rided_copy/prefetch

3-component vector type is supported for them per OpenCL spec.
@wenju-he
Copy link
Contributor Author

@frasercrmck please help to review, thanks

@frasercrmck frasercrmck self-requested a review April 30, 2025 08:59
@frasercrmck frasercrmck added the libclc libclc OpenCL library label Apr 30, 2025
Copy link
Contributor

@frasercrmck frasercrmck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

I see there's a note in the spec: async_work_group_copy and async_work_group_strided_copy for 3-component vector types behave as async_work_group_copy and async_work_group_strided_copy respectively for 4-component vector types. . I'm not sure what that really means for our implementation which does a loop and a store.

I also wonder why we have async/gentype.inc. With this change, couldn't we just use float/gentype.inc and integer/gentype.inc in succession?

@wenju-he
Copy link
Contributor Author

I see there's a note in the spec: async_work_group_copy and async_work_group_strided_copy for 3-component vector types behave as async_work_group_copy and async_work_group_strided_copy respectively for 4-component vector types. . I'm not sure what that really means for our implementation which does a loop and a store.

OpenCL spec requires that 3-component vector type data has the same size as 4-component vector type data in memory layout.
So the wording above is probably a re-iteration of the alignment fact.

I also wonder why we have async/gentype.inc. With this change, couldn't we just use float/gentype.inc and integer/gentype.inc in succession?

done, thanks for the suggestions. Deleted async/gentype.inc.

Copy link
Contributor

@frasercrmck frasercrmck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. We could probably come up with some kind of combined gentype.inc that does them both at once (though both currently undef __CLC_BODY after they finish so we'd need to be able to stop that, or preserve __CLC_BODY across gentypes).

Come to think of it, we do a lot of unnecessary undef __CLC_BODY around the codebase. I will try to clean that up in a separate PR.

@wenju-he
Copy link
Contributor Author

We could probably come up with some kind of combined gentype.inc that does them both at once

good idea.

@frasercrmck frasercrmck merged commit 5b6fc61 into llvm:main Apr 30, 2025
9 checks passed
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
…rided_copy/prefetch (llvm#137932)

3-component vector type is supported for them per OpenCL spec.
GeorgeARM pushed a commit to GeorgeARM/llvm-project that referenced this pull request May 7, 2025
…rided_copy/prefetch (llvm#137932)

3-component vector type is supported for them per OpenCL spec.
@wenju-he wenju-he deleted the async-vec3 branch June 5, 2025 08:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

libclc libclc OpenCL library

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants